machine type
Improving Anomalous Sound Detection with Attribute-aware Representation from Domain-adaptive Pre-training
Fang, Xin, Zhong, Guirui, Wang, Qing, Chu, Fan, Wang, Lei, Qian, Mengui, Cai, Mingqi, Wu, Jiangzhao, Gao, Jianqing, Du, Jun
ABSTRACT Anomalous Sound Detection (ASD) is often formulated as a machine attribute classification task, a strategy necessitated by the common scenario where only normal data is available for training. However, the exhaustive collection of machine attribute labels is laborious and impractical. To address the challenge of missing attribute labels, this paper proposes an agglomerative hierarchical clustering method for the assignment of pseudo-attribute labels using representations derived from a domain-adaptive pre-trained model, which are expected to capture machine attribute characteristics. We then apply model adaptation to this pre-trained model through supervised fine-tuning for machine attribute classification, resulting in a new state-of-the-art performance. Evaluation on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2025 Challenge dataset demonstrates that our proposed approach yields significant performance gains, ultimately outperforming our previous top-ranking system in the challenge.
- Asia > China > Anhui Province > Hefei (0.05)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
Contrastive Learning with Spectrum Information Augmentation in Abnormal Sound Detection
Meng, Xinxin, Guo, Jiangtao, Zhang, Yunxiang, Huang, Shun
The outlier exposure method is an effective approach to address the unsupervised anomaly sound detection problem. The key focus of this method is how to make the model learn the distribution space of normal data. Based on biological perception and data analysis, it is found that anomalous audio and noise often have higher frequencies. Therefore, we propose a data augmentation method for high-frequency information in contrastive learning. This enables the model to pay more attention to the low-frequency information of the audio, which represents the normal operational mode of the machine. We evaluated the proposed method on the DCASE 2020 Task 2. The results showed that our method outperformed other contrastive learning methods used on this dataset. We also evaluated the generalizability of our method on the DCASE 2022 Task 2 dataset.
- Asia > China > Fujian Province > Xiamen (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
ESTM: An Enhanced Dual-Branch Spectral-Temporal Mamba for Anomalous Sound Detection
Ma, Chengyuan, Jia, Peng, Guo, Hongyue, Yang, Wenming
The core challenge in industrial equipment anoma lous sound detection (ASD) lies in modeling the time-frequency coupling characteristics of acoustic features. Existing modeling methods are limited by local receptive fields, making it difficult to capture long-range temporal patterns and cross-band dynamic coupling effects in machine acoustic features. In this paper, we propose a novel framework, ESTM, which is based on a dual-path Mamba architecture with time-frequency decoupled modeling and utilizes Selective State-Space Models (SSM) for long-range sequence modeling. ESTM extracts rich feature representations from different time segments and frequency bands by fusing enhanced Mel spectrograms and raw audio features, while further improving sensitivity to anomalous patterns through the TriStat-Gating (TSG) module. Our experiments demonstrate that ESTM improves anomalous detection performance on the DCASE 2020 Task 2 dataset, further validating the effectiveness of the proposed method.
- Asia > China > Liaoning Province > Dalian (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- (3 more...)
An Enhanced Audio Feature Tailored for Anomalous Sound Detection Based on Pre-trained Models
Zhong, Guirui, Wang, Qing, Du, Jun, Wang, Lei, Cai, Mingqi, Fang, Xin
Anomalous Sound Detection (ASD) aims at identifying anomalous sounds from machines and has gained extensive research interests from both academia and industry. However, the uncertainty of anomaly location and much redundant information such as noise in machine sounds hinder the improvement of ASD system performance. This paper proposes a novel audio feature of filter banks with evenly distributed intervals, ensuring equal attention to all frequency ranges in the audio, which enhances the detection of anomalies in machine sounds. Moreover, based on pre-trained models, this paper presents a parameter-free feature enhancement approach to remove redundant information in machine audio. It is believed that this parameter-free strategy facilitates the effective transfer of universal knowledge from pre-trained tasks to the ASD task during model fine-tuning. Evaluation results on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2024 Challenge dataset demonstrate significant improvements in ASD performance with our proposed methods.
MIMII-Agent: Leveraging LLMs with Function Calling for Relative Evaluation of Anomalous Sound Detection
Purohit, Harsh, Nishida, Tomoya, Dohi, Kota, Endo, Takashi, Kawaguchi, Yohei
This paper proposes a method for generating machine-type-specific anomalies to evaluate the relative performance of unsupervised anomalous sound detection (UASD) systems across different machine types, even in the absence of real anomaly sound data. Conventional keyword-based data augmentation methods often produce unrealistic sounds due to their reliance on manually defined labels, limiting scalability as machine types and anomaly patterns diversify. Advanced audio generative models, such as MIMII-Gen, show promise but typically depend on anomalous training data, making them less effective when diverse anomalous examples are unavailable. To address these limitations, we propose a novel synthesis approach leveraging large language models (LLMs) to interpret textual descriptions of faults and automatically select audio transformation functions, converting normal machine sounds into diverse and plausible anomalous sounds. We validate this approach by evaluating a UASD system trained only on normal sounds from five machine types, using both real and synthetic anomaly data. Experimental results reveal consistent trends in relative detection difficulty across machine types between synthetic and real anomalies. This finding supports our hypothesis and highlights the effectiveness of the proposed LLM-based synthesis approach for relative evaluation of UASD systems.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States (0.05)
Designing Cellular Manufacturing System in Presence of Alternative Process Plans
Uddin, Md. Kutub, Islam, Md. Saiful, Jahin, Md Abrar, Irfan, Md. Tanjid Hossen, Seam, Md. Saiful Islam, Mridha, M. F.
In the design of cellular manufacturing systems (CMS), numerous technological and managerial decisions must be made at both the design and operational stages. The first step in designing a CMS involves grouping parts and machines. In this paper, four integer programming formulations are presented for grouping parts and machines in a CMS at both the design and operational levels for a generalized grouping problem, where each part has more than one process plan, and each operation of a process plan can be performed on more than one machine. The minimization of inter-cell and intra-cell movements is achieved by assigning the maximum possible number of consecutive operations of a part type to the same cell and to the same machine, respectively. The suitability of minimizing inter-cell and intra-cell movements as an objective, compared to other objectives such as minimizing investment costs on machines, operating costs, etc., is discussed. Numerical examples are included to illustrate the workings of the formulations.
- Asia > Japan > Kyūshū & Okinawa > Okinawa (0.04)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
ASD-Diffusion: Anomalous Sound Detection with Diffusion Models
Zhang, Fengrun, Xie, Xiang, Guo, Kai
Unsupervised Anomalous Sound Detection (ASD) aims to design a generalizable method that can be used to detect anomalies when only normal sounds are given. In this paper, Anomalous Sound Detection based on Diffusion Models (ASD-Diffusion) is proposed for ASD in real-world factories. In our pipeline, the anomalies in acoustic features are reconstructed from their noisy corrupted features into their approximate normal pattern. Secondly, a post-processing anomalies filter algorithm is proposed to detect anomalies that exhibit significant deviation from the original input after reconstruction. Furthermore, denoising diffusion implicit model is introduced to accelerate the inference speed by a longer sampling interval of the denoising process. The proposed method is innovative in the application of diffusion models as a new scheme. Experimental results on the development set of DCASE 2023 challenge task 2 outperform the baseline by 7.75%, demonstrating the effectiveness of the proposed method.
Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models
Zheng, Xinhu, Jiang, Anbai, Han, Bing, Qian, Yanmin, Fan, Pingyi, Liu, Jia, Zhang, Wei-Qiang
Anomalous Sound Detection (ASD) has gained significant interest through the application of various Artificial Intelligence (AI) technologies in industrial settings. Though possessing great potential, ASD systems can hardly be readily deployed in real production sites due to the generalization problem, which is primarily caused by the difficulty of data collection and the complexity of environmental factors. This paper introduces a robust ASD model that leverages audio pre-trained models. Specifically, we fine-tune these models using machine operation data, employing SpecAug as a data augmentation strategy. Additionally, we investigate the impact of utilizing Low-Rank Adaptation (LoRA) tuning instead of full fine-tuning to address the problem of limited data for fine-tuning. Our experiments on the DCASE2023 Task 2 dataset establish a new benchmark of 77.75% on the evaluation set, with a significant improvement of 6.48% compared with previous state-of-the-art (SOTA) models, including top-tier traditional convolutional networks and speech pre-trained models, which demonstrates the effectiveness of audio pre-trained models with LoRA tuning. Ablation studies are also conducted to showcase the efficacy of the proposed scheme.
Deep Generic Representations for Domain-Generalized Anomalous Sound Detection
Saengthong, Phurich, Shinozaki, Takahiro
Developing a reliable anomalous sound detection (ASD) system requires robustness to noise, adaptation to domain shifts, and effective performance with limited training data. Current leading methods rely on extensive labeled data for each target machine type to train feature extractors using Outlier-Exposure (OE) techniques, yet their performance on the target domain remains sub-optimal. In this paper, we present \textit{GenRep}, which utilizes generic feature representations from a robust, large-scale pre-trained feature extractor combined with kNN for domain-generalized ASD, without the need for fine-tuning. \textit{GenRep} incorporates MemMixup, a simple approach for augmenting the target memory bank using nearest source samples, paired with a domain normalization technique to address the imbalance between source and target domains. \textit{GenRep} outperforms the best OE-based approach without a need for labeled data with an Official Score of 73.79\% on the DCASE2023T2 Eval set and demonstrates robustness under limited data scenarios. The code is available open-source.
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Europe > Spain (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
CoopASD: Cooperative Machine Anomalous Sound Detection with Privacy Concerns
Jiang, Anbai, Shi, Yuchen, Fan, Pingyi, Zhang, Wei-Qiang, Liu, Jia
Machine anomalous sound detection (ASD) has emerged as one of the most promising applications in the Industrial Internet of Things (IIoT) due to its unprecedented efficacy in mitigating risks of malfunctions and promoting production efficiency. Previous works mainly investigated the machine ASD task under centralized settings. However, developing the ASD system under decentralized settings is crucial in practice, since the machine data are dispersed in various factories and the data should not be explicitly shared due to privacy concerns. To enable these factories to cooperatively develop a scalable ASD model while preserving their privacy, we propose a novel framework named CoopASD, where each factory trains an ASD model on its local dataset, and a central server aggregates these local models periodically. We employ a pre-trained model as the backbone of the ASD model to improve its robustness and develop specialized techniques to stabilize the model under a completely non-iid and domain shift setting. Compared with previous state-of-the-art (SOTA) models trained in centralized settings, CoopASD showcases competitive results with negligible degradation of 0.08%. We also conduct extensive ablation studies to demonstrate the effectiveness of CoopASD.